SEJITS: Getting Productivity and Performance With Selective Embedded JIT Specialization
Authors
Abstract
Today’s “high productivity” programming languages such as Python lack the performance of harder-to-program “efficiency” languages (CUDA, Cilk, C with OpenMP) that can exploit extensive programmer knowledge of parallel hardware architectures. We combine efficiency-language performance with productivity-language programmability using selective embedded just-in-time specialization (SEJITS). At runtime, we specialize (generate, compile, and execute efficiency-language source code for) an application-specific and platform-specific subset of a productivity language, largely invisibly to the application programmer. Because the specialization machinery is implemented in the productivity language itself, it is easy for efficiency programmers to incrementally add specializers for new domain abstractions, new hardware, or both. SEJITS has the potential to bridge productivity-layer research and efficiency-layer research, allowing domain experts to exploit different parallel hardware architectures with a fraction of the programmer time and effort usually required.
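The specialize-or-fall-back flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `specialize` decorator, `generate_source`, and `Unsupported` are hypothetical names, and generated Python source stands in for efficiency-language code (a real specializer would emit and compile C, CUDA, or similar). The sketch only assumes that a one-argument function with a single arithmetic `return` is the "supported subset"; anything else runs unchanged in the interpreter.

```python
import ast
import inspect
import textwrap


class Unsupported(Exception):
    """Raised when a function falls outside the subset this sketch can specialize."""


def generate_source(func):
    """Generate source for a fused loop over a list (stand-in for efficiency-language code)."""
    try:
        source = textwrap.dedent(inspect.getsource(func))
        fdef = ast.parse(source).body[0]
    except (OSError, TypeError, SyntaxError) as exc:
        raise Unsupported("source not available") from exc
    if not isinstance(fdef, ast.FunctionDef) or len(fdef.args.args) != 1:
        raise Unsupported("expected a one-argument function")
    body = fdef.body
    if len(body) != 1 or not isinstance(body[0], ast.Return) or body[0].value is None:
        raise Unsupported("expected a single return statement")
    arg = fdef.args.args[0].arg
    expr = body[0].value
    # Supported subset (an assumption of this sketch): arithmetic on the argument and constants.
    allowed = (ast.BinOp, ast.UnaryOp, ast.Name, ast.Constant,
               ast.operator, ast.unaryop, ast.expr_context)
    for node in ast.walk(expr):
        if not isinstance(node, allowed):
            raise Unsupported("unsupported construct: " + type(node).__name__)
        if isinstance(node, ast.Name) and node.id != arg:
            raise Unsupported("free variable: " + node.id)
    return f"def kernel(xs):\n    return [{ast.unparse(expr)} for {arg} in xs]\n"


def specialize(func):
    """Hypothetical decorator: specialize on first call, otherwise fall back silently."""
    cache = {}

    def wrapper(xs):
        if "kernel" not in cache:
            try:
                namespace = {}
                # "Compile" step: a real specializer would invoke a C or CUDA toolchain here.
                exec(compile(generate_source(func), "<specialized>", "exec"), namespace)
                cache["kernel"] = namespace["kernel"]
            except Unsupported:
                # Selective: code outside the subset runs as ordinary Python.
                cache["kernel"] = lambda values: [func(v) for v in values]
        return cache["kernel"](xs)

    return wrapper


@specialize
def scale_and_shift(x):
    return 2 * x + 1


@specialize
def clamp(x):
    return max(x, 0)  # contains a call, so this sketch falls back to the interpreter
```

Calling `scale_and_shift([1, 2, 3])` or `clamp([-1, 2])` looks identical to the caller whether or not specialization succeeded, which is the sense in which the machinery stays largely invisible to the application programmer. Caching the generated kernel after the first call means the generation and compilation cost is paid once per function.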
Related resources
Bringing Parallel Performance to Python with Domain-Specific Selective Embedded Just-in-Time Specialization
Today’s productivity programmers, such as scientists who need to write code to do science, are typically forced to choose between productive and maintainable code with modest performance (e.g. Python plus native libraries such as SciPy [SciPy]) or complex, brittle, hardware-specific code that entangles application logic with performance concerns but runs two to three orders of magnitude faster ...
Enabling Inter-Machine Parallelism in High-Level Languages with SEJITS and MapReduce
Selective, embedded, just-in-time specialization (SEJITS) is a technique for optimizing embedded domain-specific languages through the use of specializers, or code modules developed by expert programmers that target particular accelerators such as multicore processors and GPUs via just-in-time compilation. We extend SEJITS to exploit inter-machine parallelism by targeting clusters of machines via...
A Framework for Productive, Efficient and Portable Parallel Computing
By Ekaterina I. Gonina. Doctor of Philosophy in Computer Science, University of California, Berkeley. Professor Kurt Keutzer, Chair. Developing efficient parallel implementations and fully utilizing the available resources of parallel platforms is now required for software applications to scale to new generations of processors. Y...
Efficient Parallel Graph Algorithms in Python
Domain experts in a variety of fields use large-scale graph analysis; however, creating high-performance parallel graph applications currently requires expertise in both graph theory and parallel programming, which might not be available to the domain specialist. This project explores methods for bringing efficient parallel performance to graph applications written in Python using selective ...
Auto-tuning the Matrix Powers Kernel with SEJITS
The matrix powers kernel, used in communication-avoiding Krylov subspace methods, requires runtime auto-tuning for best performance. We demonstrate how the SEJITS (Selective Embedded Just-In-Time Specialization) approach can be used to deliver a high-performance and performance-portable implementation of the matrix powers kernel to application authors, while separating their high-level concerns ...